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1.  INTRODUCTION 


A  variety  of  new  three-dimensional  models  of  the  Earth's  seismic  velocity  structure  have 
been  generated  in  the  last  decade,  using  more  sophisticated  numerical  techniques  and  an  ever 
increasing  amount  of  high  quality  data.  Some  of  these  velocity  models  are  designed  to  illuminate 
the  dynamic  structure  of  the  Earth's  internal  mechanics,  while  others  are  primarily  used  for  their 
predictive  qualities  in  various  seismic  wavefield  phenomena.  For  example,  accurate  travel-time 
predictions  for  regional  seismic  phases  are  essential  for  locating  seismic  events  with  the  accuracy 
needed  for  nuclear  monitoring  decisions.  The  events  of  interest  to  monitoring  seismologists  are 
those  at  lower  magnitudes,  which  will  more  than  likely  only  be  recorded  on  a  sparse  network  of 
stations  at  regional  distances  less  than  approximately  20°.  Under  these  conditions,  systematic 
biases  caused  by  inadequately  modeled  Earth  structures  cause  errors  in  the  estimation  of  travel 
times  and  amplitudes  of  seismic  phases.  More  accurate  and  reliable  estimates  of  these  quantities 
(especially  in  aseismic  regions)  have  great  potential  to  improve  nuclear  monitoring  efforts  to 
detect,  locate  and  discriminate  regional  events.  Travel  times  calculated  through  a  three- 
dimensional  (3D)  Earth  model  have  the  best  chance  of  achieving  acceptable  prediction  errors,  if 
the  model  is  constrained  by  sufficient  data.  Previous  work  has  shown  that  incorporating 
predictions  from  3D  models  significantly  improves  the  location  of  regional  events.  For  instance, 
Ritzwoller  et  al.  (2003)  demonstrated  that  a  shear-wave  model  developed  from  broadband 
surface-wave  measurements  could  be  used  to  improve  the  predictions  of  Pn  and  P  travel  times, 
and  subsequent  regional  event  locations,  using  a  thermoelastic  conversion  from  shear  to 
compressional  velocities.  Flanagan  et  al.  (2007)  also  demonstrated  regional  location 
improvement  using  an  a  priori  model  of  Western  Eurasia  and  North  Africa  (WENA  1.0; 
Pasyanos  et  al.,  2004). 

The  difficulty  of  converting  an  S  velocity  model  to  a  P  veloicty  model,  or  vice  versa,  in  order 
to  get  a  complete  seismic  velocity  model,  has  been  a  persistent  problem  that  has  hampered 
previous  efforts  to  develop  regional  crust  and  upper  mantle  models.  In  some  instances,  empirical 
scaling  relations  (e.g.  Poisson's  ratio)  are  assumed  for  the  crust  and  upper  mantle,  and  the  S 
model  is  then  converted  to  a  P  model.  However,  in  regions  with  sedimentary  basins  or 
anomalous  mantle  structures,  the  assumed  relationship  may  be  wrong,  leading  to  errors  in  the 
velocity  model  derived  from  the  conversion.  In  other  cases,  additional  geophysical  data  sets, 
such  as  receiver  functions  or  gravity,  can  be  used  to  complement  the  results  derived  from  surface 
waves  (e.g.  Ammon  et  al.,  1990;  Julia  et  al.,  2000).  A  significant  drawback  for  these  efforts  can 
be  the  discrepancy  between  the  resolution  or  spatial  coverage  of  the  various  data  sets.  At  the 
global  teleseismic  scale,  several  researchers  have  utilized  body-wave  travel  times  and  surface- 
wave  dispersion  measurements  (among  other  data)  jointly  to  produce  P  and  S  models  of  the 
Earth's  mantle  (e.g.,  Su  and  Dziewonski,  1997;  Masters  et  al.,  2000;  Antolik  et  al.,  2003). 
However,  a  joint  inversion  approach  for  detailed  regional  crust  and  upper-mantle  P  and  S 
velocity  models  has  not  yet  been  attempted. 

With  this  motivation,  we  have  developed  a  self-consistent  3D  P  and  S  velocity  model  of  the 
crust  and  upper  mantle  in  a  large  region  of  central  and  southern  Asia.  The  new  model  is  the 
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result  of  a  nonlinear,  joint  body-wave/surface-wave  inversion  method  applied  to  Pn  travel  times 
and  Rayleigh  group-velocity  measurements.  The  body-  and  surface-wave  data  used  in  the 
inversion  were  specifically  chosen  for  their  complementary  sensitivity  to  the  velocity  structures 
at  crust  and  upper  mantle  depths.  Consistency  between  the  P  and  S  velocities  is  achieved  by 
imposing  bounds  on  Poisson's  ratio  and  by  invoking  a  regularization  constraint  that  correlates 
variations  in  P  and  S  velocity  from  an  initial  model.  We  report  on  the  application  of  the  new 
inversion  technique  to  data  observed  in  the  broad  region  shown  in  Figure  1.  This  region, 
extending  from  0-60°  N  and  30-120°  E,  covers  some  of  the  most  tectonically  complex  areas  on 
Earth.  The  inversion  model  extends  from  the  surface  to  approximately  300  km  beneath  the 
surface. 


To  evaluate  the  usefulness  of  the  new  model  for  locating  small  regional  events  we  have 
performed  several  validation  exercises  using  a  set  of  earthquakes  and  explosions  with  well- 
known  epicentral  locations.  These  tests  include  travel-time  predictions  and  relocations  of 
ground-truth  events  in  the  study  area  using  both  P  and  S  regional  phase  arrivals.  Our  validation 
results  indicate  that  in  many  cases  our  3D  model  achieves  excellent  travel-time  prediction  and 
epicentral  accuracy. 


Figure  1:  A  topographic  map  of  the  study  region,  which  encompasses  most  of  central  and 
southern  Asia  and  portions  of  the  Middle  East.  Dashed  black  lines  indicate  major  plate  boundaries, 
and  labels  denote  some  major  tectonic  features  mentioned  in  the  text.  The  thick  white  dashed  line 
outlines  the  area  covered  by  the  new  inversion  model. 
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2.  MODELING  AND  INVERSION  METHODS 


2.1  Model  Parameterization 

A  complete  3D  model  for  our  purposes  comprises  five  3D  parameter  functions:  a  (P 
velocity),  6  (S  velocity),  p  (density),  Qa  and  Qe  (quality  factors  for  P  and  S  velocity).  We 
represent  these  as  a  latitude/longitude  grid  of  vertical  profiles,  all  sharing  a  common 
parameterization  with  respect  to  depth.  Namely,  the  crust  at  each  latitude/longitude  point  is 
divided  into  six,  vertically  homogeneous  layers  corresponding  to  those  of  the  CRUST2.0  model 
(Laske  et  al.,  2001):  water  overlying  two  sediment  layers  (soft  and  hard  sediments)  overlying 
three  igneous/metamorphic  layers  (upper,  middle  and  lower  crust).  The  thicknesses  of  the  crustal 
layers  are  allowed  to  vary  between  profiles.  The  upper  mantle  is  represented  vertically  by 
piecewise  linear  parameter  functions  sampled  at  17  nodes  distributed  between  the  Moho 
discontinuity  and  a  depth  of  410  km.  The  parameters  are  discontinuous  across  the  Moho. 


2.2  Forward  Modeling 

The  forward  problem  in  our  inversion  methods  requires  prediction  of  first-arriving  body- 
wave  travel  times  and  frequency-dependent  surface-wave  group  delays  in  a  three-dimensional 
(3D)  Earth  model.  We  calculate  body-wave  travel  times  with  the  finite-difference  method  of 
Podvin  and  Lecomte  (1991)  as  implemented  by  Lomax  et  al.  (2000).  We  extended  the  Podvin- 
Lecomte  (P-L)  software  with  algorithms  for  mapping  Earth  models  in  spherical  coordinates  to 
flat-Earth,  Cartesian  models,  and  for  generating  the  partial  derivatives  (sensitivities)  of  the 
calculated  travel  times  with  respect  to  the  P-wave  velocity  model  parameters. 


Our  method  for  modeling  group  delays  assumes  that  the  phase  delay  of  a  Rayleigh  wave  at  a 
fixed  frequency  a>  is  a  line  integral  of  a  local  phase  velocity  along  a  path  connecting  the  source 
and  receiver  locations.  Denoting  the  local  phase-velocity  function,  or  map,  as  c{6,(p;co),  the 
propagation  path  as  Ym,  and  phase  delay  as  fh{co),  we  have 


tvh(a)  =  Jr 

1  CO 


ds 

c(0,<b;coy 


(i) 


where  s  is  the  distance  along  the  path.  This  equation  assumes  there  is  no  static  phase  attributable 
to  source  excitation  or  receiver  effects.  Previous  workers  in  surface-wave  tomography  have 
generally  taken  rm  to  be  the  great-circle  arc  connecting  the  source  and  receiver  (e.g.  Woodhouse 
and  Dziewonski,  1984).  Bukchin  et  al.  (2006)  have  proposed  that  a  better  approximation,  that 
accounts  for  the  focusing  effects  due  to  lateral  variations  in  local  phase  velocity,  is  achieved  with 
rw  taken  as  the  geometrical  ray  (Fermat  path)  through  the  phase-velocity  map,  although  the 
authors  stress  that  the  effects  of  strong  lateral  variations  can  only  be  accurately  modeled  with  a 
numerical  wave-equation  or  mode-coupling  approach.  Our  modeling  technique  adopts  the 
Fermat  approach,  implemented  with  a  2D  version  of  the  Podvin-Lecomte  finite-difference 
method  to  obtain  r(J. 


The  group  delay  between  a  source  and  receiver  is  defined  as 


(2) 
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We  can  apply  Equation  (2)  to  equation  (1)  -  ignoring  the  dependence  of  rw  on  a )  since  tph(co)  is 
stationary  with  respect  to  path  perturbations — and  obtain 


tzr(co) 


ds 

u(0,4>;o))' 


(3) 


where  u(6, 4>;  o>)  is  the  group-velocity  map  at  frequency  <x>,  given  by 

— i —  =  A.  ( — " — ).  (4 

u(Q,§](x))  dco  Kc(0,§)(x)y  v 

We  obtain  the  phase-velocity  and  group-velocity  maps  for  a  3D  Earth  model  with  point-wise 
dispersion  calculations  in  ID  Earth  models  (Herrmann,  2002).  That  is,  for  a  fixed  geographic 
location  (0, 4>),  c(9, 4>;  to)  and  u(Q,  4>;o>)  are  calculated  from  the  spherically  symmetric  Earth 
model  whose  depth-dependent  velocities  and  density  are  given  by  the  vertical  profile  at  (0,  (ji): 
a(6, 4>,z),  etc.  Implicit  in  these  calculations  are  an  earth- flattening  transformation  that  corrects 
for  the  Earth's  sphericity  and  corrections  for  anelasticity  using  the  relationships  developed  by  Liu 
et  al.  (1976)  and  Yu  and  Mitchell  (1979). 


2.3  Joint  Inversion  Approach 

Our  inversion  procedure  iteratively  updates  an  initial,  or  prior,  model  with  the  objective  of 
fitting  both  P-wave  travel-time  and  Rayleigh- wave  group-delay  observations,  subject  to 
smoothness  and  other  constraints  on  the  model  parameters.  The  prior  model  is  a  composite  3D 
model  consisting  of  CRUST2.0  for  the  crust  and  the  1-D  AK135  reference  model  (Kennett  et  al., 
1995)  for  the  mantle.  The  CRUST2.0  Pn  and  Sn  velocities  were  ignored  in  favor  of  the  AK135 
velocities  (a  =  8.04  km/s,  /?  =  4.48  km/s  at  the  top  of  the  mantle).  However,  the  CRUST2.0 
variable  Moho  depth  was  retained  and  accommodated  by  vertical  compression  or  extension  of 
the  AK135  mantle  thickness  to  a  depth  of  210  km. 

The  free  parameters  in  the  inversion  include  the  velocities,  a  and  /?,  for  three  crustal  layers 
(upper,  middle,  lower  crust)  and  16  of  the  17  vertical  nodes  in  the  upper  mantle;  the  parameters 
at  410  km  are  held  fixed  to  AK135  values.  Additionally,  we  vary  the  Moho  depth,  or  crustal 
thickness,  as  an  explicit  unknown  parameter  function  (of  latitude  and  longitude).  As  in  the 
construction  of  the  prior  model,  changes  to  Moho  depth  are  implemented  as  a  proportional 
extension  or  compression  of  the  crustal  layers  and  upper  mantle  nodal  depths.  The  inversion 
allows  changes  to  density,  p,  by  linking  them  directly  to  changes  in  a  through  a  version  of 
Birch's  Law  (Birch,  1961).  We  do  not  allow  Qa  and  Qp  to  change  from  their  initial  values. 

We  can  thus  formulate  our  inversion  method  with  four  vectors  of  unknown  model 
parameters.  We  let  vectors  a  and  b  contain  the  layer  or  nodal  values  of  the  P-wave  and  S-wave 
velocity  functions,  a  (9, 4>,z)  and  /?(0,  <J),  z),  respectively,  and  let  r  contain  the  corresponding 
values  of  the  3D  density  function,  p(0, 4>,z).  The  fourth  vector,  h,  contains  nodal  values  of  the 
crustal  thickness  function,  which  we  denote  Hcr(6, 4>).  The  data  vectors  involved  in  the  problem 
are  t,  containing  observed  P-wave  travel  times  for  various  source-receiver  paths;  and  g, 
containing  observed  group  delays  for  various  paths  and  frequencies.  We  express  the  joint  inverse 
problem  in  terms  of  these  vectors  as 
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g  =  Fg(  a,b,r,h)  +  eg 


(5) 

(6) 


t  =  Ft(a,h)  +  et, 

where  Fg  and  Ft  are  nonlinear  forward  modeling  functions,  and  eg  and  et  are  error  vectors. 
While  not  shown,  it  is  understood  that  Fg  also  depends  on  the  fixed  parameters  Qa  and  Qp. 
Equations  (5)  and  (6)  are  coupled  through  the  dependence  of  g  and  t  on  common  parameters,  a 
and  h. 


2.4  Factoring  the  Surface-Wave  Inverse  Problem 

We  decompose  the  surface-wave  inverse  problem  in  Equation  (5)  to  mirror  the  two-stage 
procedure  we  use  in  forward  modeling  group  delays,  described  in  Section  1.2.  Let  the  vectors  u 
and  c,  respectively,  contain  values  of  the  local  group-velocity  and  phase-velocity  maps  at  the 
nodes  of  our  geographic  grid  and  at  the  observation  frequencies.  The  first  stage  in  calculating 
group  delays  for  a  model  (a,  b,  r,  h) — dispersion  calculations  in  ID  media — defines  functions  U 
and  C,  such  that 

c  =  C(a,  b,  r,  h)  (7 

u  =  t/(a,  b,  r,  h).  (8 

The  second  stage  -  integrating  group-velocity  maps  along  Fermat  paths  found  by  2D  raytracing 
through  phase- velocity  maps  -  defines  a  function  Hg,  such  that 

F5(a,b,r,h)  =  Hg(u,c).  (9 

The  dependence  of  Hg  on  u  arises  from  the  integrand  of  Equation  (3),  while  the  dependence  on  c 
is  from  the  integration  path.  Equations  (7)  -  (9)  allow  us  to  factor  the  nonlinear  inverse  problem 
of  Equation  (5)  as 

g  =  Hg(u,c)  +  eg  (10 

c  =  C(a,  b,  r,  h)  +  ec  (11 

u  =  f/(a,  b,  r,  h)  +  eu,  (12 

where,  strictly,  eu  =  ec  =  0. 


While  our  notation  does  not  show  it,  it  is  important  to  point  out  that,  as  a  consequence  of  our 
forward  modeling  assumptions,  each  of  Equations  (10)  -  (12)  actually  embodies  a  set  of 
decoupled  problems.  Equation  (10)  defines  one  inverse  problem  for  each  frequency  to  since  there 
is  no  cross-frequency  dependence  of  group  delays  on  group-velocity  maps.  Equations  (11)  and 
(12)  decouple  geographically  since,  by  assumption,  the  local  phase  and  group  velocities  at  a 
latitude/longitude  point  (6, 4>)  depend  only  on  the  ID  velocity  and  density  profiles  associated 
with  that  point. 
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2.5  Linearization 

One  step  of  our  iterative  inversion  procedure  solves  linear  approximations  to  the  surface- 
wave  and  body-wave  inverse  problems  with  certain  parameter  dependences  ignored  in  order  to 
decouple  the  problems.  Let  (aref,  bref,  rrep  hre^)  denote  the  parameters  of  a  reference  model. 
Starting  with  the  body-wave  problem,  we  linearize  Equation  (6)  as 

t  —  Ft(aref,  hreyO  +  At(a  —  a  re^)  E  et,  (13) 

where  At  is  the  Jacobian  matrix  (partial  derivative  matrix)  of  Ft  with  respect  to  a,  evaluated  at 
the  reference  model.  In  this  case,  we  ignore  the  dependence  on  crustal  thickness  h,  assumed  to 
be  fixed  at  its  reference  value. 

Linearizing  (10)  at  the  reference  model  we  get 

8  Hg  (ure/<  Ue/)  E  Bg(il  —  Ure^-)  +  Gg  (14) 

where 

Uref  fl(^re/<  ^re/' ^re/<  ^re/)  (13) 

cref  C (aref,  bref,  rref,  hre^).  (16) 

The  Jacobian  of  Hg  with  respect  to  u,  shown  as  Bg,  contains  2D  raypath  sensitivities,  analogous 
to  the  3D  raypath  sensitivities  for  body-wave  travel  times  that  comprise  At.  The  Jacobian  of  Hg 
with  respect  to  c  is  zero  since  the  Fermat  raypaths  are  stationary  with  respect  to  perturbations  in 
c  from  cref.  For  this  reason,  we  do  not  consider  the  linearization  of  Equation  (11).  Linearizing 
(12)  yields 

u  U(aref, href,  rref, h)  -I-  Bu(h)(b  b rey)  E  gu.  (17) 

Here  we  ignore  the  dependence  of  U  on  a  and  r,  but  retain  its  nonlinear  dependence  on  h.  The 
Jacobian  matrix  Bu(h)  is  derived  from  the  group-velocity  partial  derivatives  implied  by  surface- 
wave  theory  in  ID  media  (Harkrider  and  Anderson,  1966;  Harkrider,  1968). 

Under  the  same  assumptions,  we  can  also  linearize  Equation  (5),  the  unfactored  surface-wave 
inverse  problem,  to  obtain 

8  =  Fgfaref’  ^re/»  ^ref>  h)  E  A^(h)(b  —  bre^-)  +  Gg.  (18) 

Applying  the  chain  rule  to  the  composite  function  Fg  =  C),  we  find  that 

A5(h)  =  B5Bu(h).  (19) 

We  are  quick  to  point  out  that  ignoring  certain  parameter  sensitivities  will  clearly  lead  to 
suboptimal  solutions  of  the  linearized  inverse  problems.  What  is  not  clear  is  whether,  or  how, 
doing  so  affects  our  final  solution  to  the  nonlinear  joint  problem  obtained  by  the  iterative 
updating  scheme  we  use.  We  note,  however,  that  the  sensitivities  we  ignore  are  relatively  small 
so  that  Equations  (13)  -  (18)  do  retain  most  of  the  information  about  the  unknown  parameters: 
on  a  from  the  P-wave  data  and  on  /?  and  Hcr  from  the  Rayleigh-wave  data.  Moreover,  all 
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dependencies  of  data  on  parameters  are  accounted  for  in  the  final  solution  since  nonlinear 
forward  modeling  is  performed  after  each  round  of  reference  model  updating. 

The  three  linearized  inverse  problems  -  Equations  (13),  (14)  and  (17)  -  can  be  written 
collectively  as 

d  =  Flm(m)  +  e,  (20) 

where  d  is  a  data  vector,  m  is  a  model  vector,  and  e  is  an  error  vector.  The  function  Flm  is  the 
linearization  of  a  nonlinear  function  F  at  a  reference  model,  i.e. 

FUn( m)  =  F(mre/)  +  A(m  -  mre/).  (21) 

We  allow  Equation  (20)  to  stand  for  the  decoupled  versions  of  Equations  (14)  and  (17).  Thus, 
there  is  one  instance  of  (20)  for  each  frequency  in  the  surface-wave  data  set,  as  well  as  one  for 
each  geographic  point  in  the  group-velocity  maps.  Therefore,  in  all  instances,  the  model  vector 
m  becomes  a  sampled  version  of  a  model  function,  m(x),  where  x  denotes  spatial  position  and  is 
variously  3D  position  (0,  §,z),  2D  position  (0, 4>)  or  ID  position  z.  Table  1  identifies  the  data 
vector  and  model  function  for  each  of  the  linearized  inverse  problems  we  solve.  Note  that  the 
model  function  is  taken  to  be  slowness  in  the  body-wave  and  surface-wave  tomographic 
problems,  while  it  is  velocity  in  the  dispersion  inverse  problem. 


Table  1:  Data  and  Model  Associations  for  Linearized  Inverse  Problems 


Equation 

Data  Vector  ( d ) 

Model  Function 

13 

P-wave  travel  time  vs. 
source-receiver  path 

m(9,  cp,  z)  =  a1  (9,  cp,  z) 

14 

Rayleigh- wave  group  delay 
vs.  path  (fixed  co) 

m(9,  cp,  z)  =  u1  (9,  cp;  co) 

17 

Group  velocity  vs.  co  (fixed 

e,cp) 

m(z)  -  fi(9,  (p,  z) 

2.6  Updating  Procedure 

Our  iterative  inversion  procedure  starts  with  the  reference  model  (a ref,  etc.)  set  to  the  initial 
model  (aprj,  etc.).  Each  step  of  the  iteration  solves  the  linearized  inverse  problems  of  Equations 
(14),  (17)  and  (13)  in  sequence,  using  the  solutions  to  update  the  reference  model.  Specifically, 
we 

i.  Solve  (14)  to  obtain  the  solution  u. 

ii.  Solve  (17),  with  u  =  u,  to  obtain  b  and  h.  Then  set 

iiref  —  h 

b  ref  ^ 

iii.  Solve  (13)  to  obtain  a.  Then  set 

a  ref  ® 
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rr  ef 


Ip ri  ~f"  0-3  (3r ef  ®p ri) 


re/ 


■pro 


Cr ef  ^(®re/»  ^re/»  *re/»  ^re/) 
Ur  ef  —  ^  (<3 1  ■  e  f>  ^ref>  E  e  f>  hr  e  /  ) 


Step  (i)  is  done  frequency-by-frequency  and  Step  (ii)  is  done  point-wise  geographically.  After 
iterating  this  sequence  several  times,  we  take  the  latest  reference  model  to  be  the  solution  of  the 
nonlinear  joint  inverse  problem. 

Even  though  we  do  not  solve  the  linearized  inverse  problems  simultaneously,  the 
regularization  technique  and  constraints  we  apply  to  the  solutions  end  up  coupling  the  solutions 
for  a  and  b,  as  we  explain  later. 


2.7  Bayesian  Inversion  Method 

We  adopt  a  Bayesian  approach  (e.g.  Tarantola,  2005)  to  solving  each  instance  of  Equation 
(20).  We  describe  the  approach  first  in  the  absence  of  parameter  bounds,  which  we  discuss  later. 

We  assume  that  e  and  m  are  each  Gaussian  random  variables,  uncorrelated  with  each  other, 
with  prior  means  zero  and  mpri,  respectively,  and  with  prior  variance  matrices  Ve  and  Cm, 
respectively.  We  take  the  solution  of  Equation  (20)  to  be  the  maximum  a  posteriori  (MAP) 
estimator  of  m,  which  minimizes  the  objective  function  given  by 

^(m)  =  (d  -  FKn(m))TV"1(d  -  FUn( m))  +  (m  -  mpri)7’Cm1(m  -  mpr;).  (22) 

We  base  Cm  on  geostatistical  concepts  (e.g.  Deutsch  and  Joumel,  1998),  parameterizing  it  in 
terms  of  the  second-order  spatial  statistics  of  the  Gaussian  random  function  m(x).  We  specify 
these  statistics  in  terms  of  a  variance  function,  er^(x),  and  a  correlation  kernel,  C0(x,x'),  such 
that,  for  any  two  points  x  and  x', 

Cov  [m(x), m(x')]  =  om(x)  am(x')  C0(x,  x').  (23) 

We  require  C0(x,x)  =  1.  Letting  the  matrix  C0  be  a  discretized  version  of  C0(x,x'),  and  letting 
Em  be  a  diagonal  matrix  containing  samples  of  om  (x)  on  its  diagonal,  we  can  write  a  discrete 
version  of  Equation  (23)  as 

E  [(m  -  mprj)(m  -  mpri)T]  =  Cm  =  EmC0Em  (24) 

where  E[  ]  denotes  the  mathematical  expectation  operator  on  random  variables. 

We  choose  the  kernel  C0(x,  x')  to  be  a  decaying  function  of  the  spatial  separation,  x  —  x', 
with  the  decay  rate  controlled  by  a  specified  correlation  length  in  each  direction.  This  is 
achieved  by  letting  C0  be  the  Green's  function  of  a  differential  operator  D0  -  i.e. 

D0  C0  (x,  x')  =  8 (x  -  x')  (25) 


-  with  D0  defined  as 

D  =  1  [  j 

®  f('1detA(x)  L 


V  •  A2  (x)  • 


(26) 
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Here,  I  denotes  the  identity  operator;  V  is  the  gradient  operator  on  the  appropriate  spatial  domain 
(depth,  geographic  position,  or  3D  position);  and  Kt,  K2  and  F  are  constants  which  depend  on  the 
dimensionality  of  the  spatial  domain  (see  Table  2).  The  correlation  tensor,  A,  is  allowed  to  vary 
with  position  but  we  restrict  its  structure  to  be  (in  the  case  of  three  dimensions) 

A(x)  =  Az(z)  nznz  +  Ah  (ndn0  +  114,114,),  (27) 

where  nz,  etc.  are  unit  vectors  in  a  spherical  coordinate  system,  and  where  Az  and  Ah, 
respectively,  are  correlation  lengths  in  the  vertical  and  horizontal  directions. 

Table  2:  Constants  for  Regularization  Operator  in  a  Whole-Space 

Dimension  of  x _ l_ _ Kj _ K2 

1  12  1 

2  2  2n  2 

3  2  8ji  1 

In  the  3D  inversion  for  a  (Equation  (13))  and  in  the  ID  inversions  for  /?  (Equation  (17)),  we 
associate  D0  with  Neumann  boundary  conditions  (zero  normal  gradient)  at  the  Earth's  surface 
and  from  each  side  of  the  Moho  discontinuity.  The  Moho  condition  leads  to  the  de-correlation  of 
m(x)  between  the  crust  and  upper  mantle,  i.e.  C0(x,x')  =  0  when  x  and  x'  are  on  different  sides 
of  the  Moho.  Table  3  lists  the  values  of  the  geostatistical  parameters  we  used  in  each  of  the  three 
linearized  inversions. 

Numerically,  we  implement  the  geostatistical  approach  by  constructing  a  finite-difference 
matrix,  D0,  that  approximates  the  differential  operator  D0.  Given  C0  =  Dq  1  and  Equation  (24), 
the  objective  function  in  Equation  (22)  can  be  written  explicitly  in  terms  of  D0  as 

V(m)  =  (d  -  FUn(m))rVe-1(d  -  FUn( m))  +  (m  -  mpri)rS-1D02-1(m  -  mpr;).  (28) 
The  numerical  algorithms  we  use  to  minimize  lTI  will  be  mentioned  later. 


Table  3:  Geostatistical  Parameters  Used  in  Inversions 


Model  Parameter 

a1 

(slowness) 

P 

(velocity) 

u'1 

(slowness) 

Equation  in  Text 

13 

17 

14 

Std.  Dev.  (crm) 

— 

— 

3% 

Crust 

3% 

3% 

— 

Upper  Mantle 

2% 

2% 

— 

Horiz.  correlation  length  (A/,) 

350  km 

— 

350  km 

Vertical  correlation  length  (Az) 

Crust 

0.5  Hcr 

0.5  Hcr 

— 

Upper  Mantle 

70  km 

70  km 

— 
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2.8  Crustal  Thickness  Inversion 

The  linearized  surface-wave  inverse  problem,  Equation  (17),  differs  from  the  other  two 
problems  in  that  it  treats  the  crustal  thickness  vector,  h,  as  a  free  parameter.  The  objective 
function  for  this  problem  can  be  written  as 

¥(b;  h)  =  (u  -  UUn(\y,  h))rV"1(u  -  UUn(b;  h)) 

+(b  -  VO’VDoI^Cb  -  VO-  (29) 

Letting  the  minimum  of  with  respect  to  b,  with  h  fixed,  be  achieved  at  b  =  b(h),  we  can 
write  an  objective  function  for  h  as 

W(h)  =  T(b(h)).  (30) 

We  take  the  solution  for  h  to  be  the  value  minimizing  ME  Since  the  inverse  problem  decouples 
geographically,  the  minimization  is  performed  separately  for  each  (0, 4>)  in  our  model  grid, 
yielding  Hcr(0, 4>)  for  each  point.  Numerically,  we  find  the  solution  with  a  grid  search  over 
crustal  thickness  values  spanning  a  range  of  ±2.5%  from  the  current  reference  value. 

It  is  worth  noting  that  each  trial  crustal  thickness  in  a  grid  search  for  Hcr(9,  <£)  entails 
forward  modeling  calculations  for  group  velocities  at  all  the  observed  frequencies,  including  the 
computation  of  shear-wave  sensitivities,  as  well  as  the  inversion  calculations  required  to  obtain 
the  solution  shear- wave  profile,  §,z).  In  other  words,  allowing  crustal  thickness  to  be  an 
inversion  parameter  comes  at  a  high  computational  price. 

2.9  Factored  vs.  Unfactored  Surface-Wave  Inversion 

Appendix  A  shows  that  appropriate  choices  of  prior  model  means  and  prior  error  and  model 
variances  make  the  solution  of  a  nonlinear  inverse  problem,  and  of  a  factored  version  of  that 
problem,  the  same  within  a  linear  approximation.  The  equivalence  is  exact  for  linearized 
problems.  The  requirement  for  prior  means  is  easily  met;  the  prior  mean  of  u  in  solving  Equation 
(14)  is 

Upri  —  U(ilref,  bprj,  Yref,  hre^).  (31) 

However,  the  required  choice  of  error  and  model  variances  in  the  factored  problems  is 
complicated  and  difficult  to  implement.  In  particular,  the  variance  of  the  error  vector  eu  in 
Equation  (17)  would  have  to  be  set  to  the  posterior  variance  of  u  in  Equation  (14),  which  is  a 
non-diagonal  matrix  and,  in  any  case,  difficult  to  compute.  Additionally,  it  is  not  practical  to 
select  prior  model  variances  in  the  factor  problems  that  equate  to  a  desirable  prior  variance  on  b 
in  the  composite  problem,  e.g.  one  implied  by  a  3D  geostatistical  treatment  of  b  analogous  to 
that  assumed  for  a  in  the  body-wave  inversion. 

Nonetheless,  the  geostatistical  assumptions  we  did  use  for  u  and  b  in  the  factor  problems 
attempted  to  achieve  parity  with  our  3D  geostatistical  treatment  of  a  in  the  travel-time 
tomography  problem.  As  discussed  above  and  summarized  in  Table  3,  we  used  a  2D  correlation 
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kernel  for  u  and  a  ID  correlation  kernel  for  b  that  shared  the  variances  and  correlation  lengths  of 
the  3D  correlation  kernel  assumed  for  a. 


2.10  Velocity  Correlation 

In  a  Bayesian  inversion  methodology  it  is  straightforward  to  make  the  a  priori  assumption 
that  the  P  and  S  model  velocities  are  correlated.  Considering  both  a  and  p  as  functions  of  3D 
position,  we  can  express  this  as 

Cov  [a(x),  a(x')]  =  oa(x )  oa(x')  C0(x,x')  (32) 

Cov  [p (x),  p  (x')]  =  op  (x)  op  (x')  C0  (x,  x')  (33) 

Cov  [a(x),/?(x')]  =  r(x)  oa(x )  Op(x')  C0(x,x'),  (34) 

where  C0(x,x')  specifies  the  spatial  correlation  structure  of  each  velocity  function,  and  r(x) 
specifies  the  correlation  coefficient  between  velocities  at  each  spatial  point.  Separating  the 
spatial  correlation  and  velocity  correlation  in  this  way  requires  that 

r(x)  C0(x,x')  =  r(x')  C0(x,x'),  (35) 

which  forces  r  to  be  piece-wise  constant,  differing  between  points  x  and  x'  only  when 
C0(x,x')  =  0. 

A  simultaneous  inversion  for  a  and  P  can  incorporate  Equations  (32)  -  (34)  in  an  appropriate 
objective  function  of  a  and  b.  Appendix  B  shows  that  the  same  assumptions  can  be  used  in  each 
step  of  our  sequential  approach  with  an  appropriate  modification  of  the  individual  objective 
functions  for  each  inversion.  The  modifications  are  particularly  simple  under  the  restriction  that 

apri  (X)/oa  (x)  =  ppri  (x)/op  (x),  (36) 

i.e.  the  prior  means  and  standard  deviations  of  a  and  p  are  in  the  same  ratio  everywhere.  Then, 
for  example,  in  solving  Equation  (13),  the  objective  function  to  be  minimized  would  be  Equation 
(28)  with  m  =  a  and  mprj  and  Em  set  as  discretized  versions  of 

™Pri(x)  =  [1  -  r(x)]  apri (x)  +  r(x)  PrefOO  (37) 

<bn(x)  =  [1  -  r(x)2]1/2  oa (x).  (38) 

Similar  expressions  determine  the  mean  and  standard  deviation  of  b  used  in  solving  Equation 
(17). 

2.11  Parameter  Bounds 

We  constrain  the  solutions  of  Equations  (13)  and  (17)  to  be  within  prior  upper  and  lower 
bounds  on  a,  p  and  Poisson's  ratio  v,  requiring 

in  (0,4>,z)  ^  a(G,$,z)  <  ax  0 9 ,  z) 

Pmin  <  P(G,  C|),  z)  <  pmax  (9,  c|),  z) 
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(39) 

(40) 


(41) 


4*'  z)  <  V(0,cj>,z)  <  Vmax(0,  (J),  z). 

In  general,  the  constraints  at  each  point  (0,  <J),  z)  define  a  six-sided  region  of  admissible 
points  in  a-/?  space,  as  illustrated  in  Figure  2.  However,  the  bounds  in  the  separate  inversion  for 
each  velocity  comprise  a  simple  velocity  interval,  derived  by  using  reference  values  of  the  other 
velocity  to  translate  the  bounds  on  Poisson's  ratio.  That  is,  we  augment  Equation  (20),  which 
represents  each  of  the  linearized  inverse  problems  we  solve,  with  the  constraint 

inOO  —  W.(x)  —  ax(X)-  (42) 


a  (km/s) 


Figure  2:  The  default  bounds  for  the  upper  mantle  lid  on  P  velocity  (a),  S  velocity  (/?)  and 
Poisson's  ratio  (v)  are  displayed  in  or-/?  space.  The  region  shaded  green  contains  velocity  pairs 
satisfying  the  constraints.  The  horizontal  black  line  displays  the  interval  of  admissible  P  velocities 
that  results  when  the  S  velocity  is  held  to  a  reference  value  indicated  by  the  black  circle. 


For  the  P-wave  travel-time  problem,  for  example,  we  have  (for  each  x) 
m-min  max  |  (Xmin  ,  firef  j 

mmax  =  mln  [  am ax  -  Pref  J  ^2-v^L  j' 


(43) 

(44) 


This  conditional  constraint  is  illustrated  in  Figure  2  by  the  horizontal  black  line  spanning  the 
green  region  of  admissible  velocities. 


We  developed  spatially  variable  velocity  and  Poisson's  ratio  bounds  in  a  two-step  analysis. 
The  first  step  established  default  bounds,  independent  of  latitude  and  longitude,  for  each  crustal 
layer  and  as  a  function  of  depth  in  the  upper  mantle.  These  values,  listed  in  Table  4,  reflect  a 
modest  variation  from  the  AK135  model  (in  the  upper  mantle)  and  from  typical  values  from 
continental  regions  in  CRUST2.0  (in  the  crust).  The  second  step  found  the  largest  and  smallest 
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velocities  in  CRUST2.0  as  a  function  of  tectonic  regime.  Worldwide,  there  are  26  such  regimes 
(e.g.  “Platform”,  “Continental  Shelf”).  The  velocity  bounds  at  each  point  in  the  model  were 
expanded  to  accommodate  these  extrema,  for  the  appropriate  regime,  with  an  additional  0.05 
km/s  buffer. 


Table  4:  Default  Parameter  Bounds 


Depth 

a 

min 

(km/s) 

max 

P  (km/s) 
min  max 

min 

V 

max 

Upper  Crust 

5.50 

6.20 

3.00 

3.45 

0.23 

0.34 

Middle  Crust 

6.10 

6.80 

3.35 

3.80 

0.24 

0.33 

Lower  Cmst 

6.70 

7.40 

3.70 

4.10 

0.25 

0.32 

Zmoho 

7.75 

8.30 

4.30 

4.65 

0.26 

0.31 

z~  120 

7.80 

8.35 

4.35 

4.70 

0.26 

0.31 

z  =  210 

8.05 

8.55 

4.40 

4.70 

0.26 

0.31 

z  =  410 

8.75 

9.25 

4.65 

5.00 

0.26 

0.31 

To  minimize  T(m)  in  Equation  (28)  subject  to  the  constraint  (42),  we  generate  m(x)  from 
an  unconstrained  function,  q(x),  with  the  following  transformation: 


m(x)  =  mpri(x)  +  Am(x)  erf 


q(x)}  =  M(q(x)), 


(45) 


where 


A  m(x)  = 


f  "imaxW  -  wipriCx),  if  q(x)  >  0; 
1  ™min(x)  -  mpri(x),  if  q(x)  <  0. 


It  is  easily  shown  that  as  the  bounds  become  wide  compared  to  crm(x)  (|Am(x)|  » 
m(x)  -»  mpri(x)  +  (7m (x)  q(x). 

The  linear  inverse  problem  of  Equation  (20),  in  terms  of  q(x),  becomes 
problem 


(46) 

cr(x)),  then 

(47) 

the  nonlinear 


d  =  FUn(M(q))  +  e. 


(48) 


where  the  vector  q  is  a  sampled  version  of  the  function  q(x).  However,  we  do  not  simply 
substitute  Equation  (45)  into  (28)  to  obtain  an  objective  function  for  q.  Instead,  we  assume  q  to 
be  a  zero-mean  random  variable  with  unit  variance  and  the  same  spatial  correlation  kernel  as  m: 

Cov[q(x),q(x')]  =  C0(x,x')  (49) 


or 


=  C0- 


(50) 
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Figure  3:  Velocity  bounds  have  the  effect  of  modifying  a  Gaussian  prior  probability 
distribution  on  P  velocity  (green  line)  to  a  non-Gaussian  distribution  (red  line). 

Assuming  q(x )  to  be  Gaussian  has  the  effect  of  making  m(x)  non-Gaussian,  with  a  probability 
density  function  that  peaks  at  mpri(x )  and  is  coerced  to  zero  at  the  bounds,  illustrated  in  Figure 
3.  However,  in  the  limit  of  wide  bounds,  Equation  (47),  m  approaches  Gaussian  with  mean  mprj 
and  the  variance  matrix  Cm  given  in  (24). 

The  objective  function  for  q  becomes 

%(q)  =  (d  -  Fli”(M(q)))TV(T1(d  -  Flin(M(q)))  +  qrD0q.  (51) 

The  numerical  algorithm  we  used  for  minimizing  is  different  for  the  three  linearized  inverse 
problems  we  solve.  In  the  group-velocity  and  travel-time  tomography  problems  (Equations  (13) 
and  (14)),  we  perform  the  minimization  of  with  the  nonlinear  conjugate  gradients  algorithm 
described  by  Rodi  and  Mackie  (2001).  In  the  surface-wave  dispersion  inverse  problem  (Equation 
(17))  we  use  a  Gauss-Newton  technique  with  iterative  reweighting. 

3.  APPLICATION  TO  DATA 

In  our  study  we  made  use  of  observed  travel  times  from  regional  P  waves  and  frequency- 
dependent  Rayleigh-wave  group  delays  at  short  to  intermediate  distances  to  determine  the  crust 
and  upper-mantle  seismic  structure  at  depth  ranges  between  the  surface  and  approximately  300 
km  depth.  In  the  following  sections  we  provide  details  about  the  data  sets  we  utilized  and  the 
parameterization  of  the  inversion  model. 
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3.1  Travel-Time  Observations 

Our  primary  source  of  P-wave  travel  times  was  a  subset  of  the  International  Seismic  Centre 
(ISC)  catalog,  processed  according  to  the  methodology  developed  by  Engdahl,  van  der  Hilst  and 
Buland  (EHB;  Engdahl  et  al.,  1998).  We  extracted  arrivals  from  the  years  1982  -  2004  having 
event  and  station  locations  within  a  latitude  and  longitude  box  defined  as  0-60°  N  and  30-120°  E 
and  event  depths  between  0  and  200  km.  We  included  only  first-arriving  phases  denoted  Pg,  Pb 
or  Pn  and  which  were  defining  phases  for  the  EHB  locations.  We  did  not  relocate  the  events  as 
part  of  this  study,  and  therefore  filtered  the  events  carefully  to  ensure  small  epicentral 
mislocations.  In  particular,  we  required  the  secondary  azimuth  gap,  defined  as  the  largest 
azimuthal  gap  closed  by  a  single  station,  for  a  given  event  to  be  less  than  or  equal  to  130° 
(Bondar  et  al.,  2004b)  and  the  number  of  teleseismic  arrivals  to  be  at  least  15.  In  addition  to  this 
set  of  data  we  added  EHB  bulletin  picks  over  the  Tibetan  Plateau  and  southwestern  China  using 
data  from  temporary  arrays  (Li  et  al.,  2006).  The  data  set  satisfying  our  criteria  comprised 
167,384  arrivals  from  7,681  events  and  944  stations. 

We  compressed  this  data  set  by  forming  summary  events  on  a  regular  grid  having  0.5-degree 
spacing  in  latitude  and  longitude  and  containing  13  nodes  in  depth  between  0  and  200  km,  with 
the  depth  spacing  per  node  increasing  from  5  to  20  km.  For  each  summary-event  node  and  each 
station/phase  type,  a  summary  travel-time  residual  (relative  to  the  AK135  Earth  model)  was 
formed  by  averaging  the  individual  residuals  for  the  events  near  that  node.  Following  this 
compression,  stations  containing  fewer  than  25  arrivals  were  dropped  from  the  data  set.  The  use 
of  summary  events  acknowledges  the  redundant  sensitivity  of  individual  data  to  the  Earth  model 
(which  is  on  a  1 -degree  grid)  and,  combined  with  the  station-dropping  rule,  reduces  the  ray¬ 
tracing  requirements  for  the  inversion  substantially. 

The  final  database  used  in  the  body-wave  tomography  contained  104,065  arrivals  for  3,689 
summary  events  and  603  stations.  The  data  spanned  epicentral  distances  to  18.7  degrees,  and  the 
travel-time  residuals  (relative  to  AK135)  ranged  from  -9.8  to  9.9  s  with  a  root-mean- square 
(RMS)  error  of  2.6  s. 

3.2  Group- Velocity  Dispersion  Measurements 

The  surface-wave  dispersion  database  primarily  consists  of  measurements  donated  by 
Ritzwoller  and  Levshin  (1998)  and  the  Lawrence  Livermore  National  Laboratory.  Different 
portions  of  the  database  have  been  used  in  the  derivation  of  recent  models  (Ritzwoller  et  al., 
2002;  Shapiro  and  Ritzwoller,  2002;  Pasyanos,  2005).  Some  measurements  in  the  database  were 
also  made  by  the  Weston  Geophysical  research  staff  using  the  Ritzwoller  and  Levshin  (1998) 
procedures  for  measuring  group  velocities.  To  eliminate  potential  outliers  in  the  combined  data 
set,  we  performed  a  period-by-period  grooming  of  the  data,  in  which  we  retained  a  group 
velocity  measurement  if  it  was  within  two  standard  deviations  of  the  mean  group  velocity  for 
that  period.  This  exercise  resulted  in  a  database  of  98,889  fundamental-mode  Rayleigh  group 
velocity  picks  from  3,993  events,  172  stations  and  23  periods:  T  =  10-16,  18,  20,  25,  30,  35,  40, 
45,  50,  60,  70,  80,  90,  100,  120,  125  and  150  seconds.  The  surface-wave  database  is 
nonuniformly  distributed  across  the  various  periods  and  in  paths  lengths.  The  distribution  of  the 
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group  velocity  data  is  shown  in  Figure  4  as  a  function  of  period  and  path  length.  The  data  are 
dominated  by  path  lengths  between  5  and  35  degrees  periods  below  100  seconds.  Using  the 
accepted  relation  that  surface  waves  can  resolve  velocity  changes  down  to  approximately  1/3  of 
their  wavelength,  the  shear-wave  portion  of  our  model  should  be  well  constrained  to  over  200 
km. 


(a) 


30  40  50 

Distance  (degrees) 


Figure  4:  The  distribution  of  the  group  velocity  database  as  a  function  of  period  and  station-to- 
event  epicentral  distance,  (a)  The  number  of  group  velocity  measurements  (paths)  at  the  discrete 
periods  in  the  database;  (b)  the  number  of  paths  at  distance  ranges  between  0  -  90°,  in  5°  bins. 

In  Figure  5  we  show  examples  of  the  spatial  coverage  of  our  data  sets.  To  generate  these 
maps  we  used  our  extended  version  of  the  P-L  algorithms  to  calculate  sensitivities  of  the  body- 
wave  travel  times  (right  column)  and  frequency-dependent  surface-wave  delay  times  (left 
column)  in  our  data  set,  calculated  with  our  final  inversion  model.  We  then  summed  up  a 
measure  of  'hits'  in  the  model  with  respect  in  specific  depth  intervals  or  at  specific  periods  at 
each  geographic  point  in  a  0.5  x  0.5  degree  grid  in  latitude  and  longitude.  Figure  5  demonstrates 
that  the  data  sets  we  used  to  constrain  the  inversion  model  provide  excellent  coverage  of  our 
study  region. 
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Figure  5:  Examples  of  data  sensitivity  distribution,  expressed  as  loglO(hits)  for  the  surface- 
and  body-wave  data  sets  used  in  the  joint  inversion.  Left  column:  Maps  of  the  Rayleigh-wave 
group-delay  sensitivity  hits  at  20s,  60s,  and  150s;  Right  Column:  Maps  of  the  summary-event  P- 
wave  sensitivity  coverage  at  depths  just  beneath  the  Moho  Lid,  at  approximately  100  km,  and  210 
km. 
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To  quantify  the  importance  of  applying  a  2-D  Fermat  approach  to  the  forward  modeling  of 
the  group  delays  in  our  surface-wave  data  set,  we  calculated  a  measure  of  'path  wander'  for  a 
given  source-receiver  pair  as  the  maximum  deviation  from  the  curved  ray  path  to  the  great-circle 
path  (Ritzwoller  et  al.,  2002).  This  quantity  is  somewhat  tricky  to  define  in  a  finite-difference  ray 
tracing  approach  since  a  raypath  is  only  implicitly  defined  by  a  sensitivity  distribution  that  has  a 
finite  width  determined  by  such  factors  as  the  source-receiver  distance  and  the  degree  of 
heterogeneity  in  the  phase-velocity  map  for  a  given  frequency.  To  prevent  overestimating  the 
maximum  deflection  along  a  given  path,  we  calculated  the  path  wander  from  sensitivities  above 
a  threshold  chosen  to  isolate  the  equivalent  geometrical  ray  between  an  event  and  station.  Using 
this  formulation  we  found  the  maximum  lateral  deviation  from  a  great-circle  path  (a  straight  line 
between  the  station  and  the  event  in  a  Cartesian  grid)  for  all  the  arrivals  in  our  group  velocity 
database  through  the  predicted  phase-velocity  maps  calculated  with  the  final  inversion  model. 
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Figure  6:  Depiction  of  mean  lateral  path  wander  as  a  function  of  path-length  range  (vertical 
axis)  and  period  (horizontal  axis).  For  each  group  delay  in  the  database,  the  maximum  lateral 
deflection  of  the  Fermat  path  from  the  great-circle  path,  normalized  as  a  percentage  of  the  source- 
receiver  distances,  was  computed  and  assigned  to  a  200-km  path-length  bin.  The  mean  (normalized) 
deviation  for  each  bin,  as  a  function  of  period,  is  displayed  in  gray  scale.  Darker  blocks  indicate 
significant  path  wander,  which  occurs  primarily  at  shorter  periods  and  longer  path  lengths.  White 
blocks  indicate  that  a  particular  bin  either  had  limited  measurements  or  low  lateral  deviation. 

In  Figure  6  we  show  the  summary  results  of  these  calculations,  which  are  plotted  in  matrix 
form,  after  they  have  been  grouped  as  a  function  of  period  (horizontal  axis)  and  200-km-wide 
path-length  bins  (vertical  axis).  The  results  are  presented  as  the  mean  absolute  deviation, 
normalized  to  be  a  percentage  of  the  total  path  length,  among  all  the  paths  in  a  given 
distance/period  bin.  A  white  colored  block  indicates  either  no  measurements  existed  or  there  was 
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insignificant  lateral  deviation.  The  figure  demonstrates  that  there  are  patches  of  significant  path 
wander,  mainly  for  measurements  made  at  shorter  periods  (up  to  perhaps  40  seconds)  and  longer 
path  lengths.  At  longer  periods  the  predicted  phase-velocity  maps  do  not  exhibit  large  lateral 
variation,  which  is  the  primary  factor  that  deviates  rays  from  a  great-circle  path.  At  shorter  path 
lengths,  the  positions  of  the  event  and  station  are  the  most  important  factors,  since  if  they  sample 
regions  with  intervening  strong  lateral  variability  the  ray  path  will  deviate  significantly.  These 
regions  of  strong  lateral  variability  exist  in  our  model  region,  but  their  effect  is  somewhat 
masked  in  Figure  6  when  we  combine  the  results  from  many  paths  into  a  single  mean  value  at  a 
given  period.  Overall,  the  results  indicate  that  only  certain  regions  of  our  model  (e.g.  the 
Himalayan  front)  require  the  Fermat  approach,  but  we  use  it  everywhere,  since  there  is  a  small 
computational  price  exacted  by  using  the  same  algorithm  for  both  body-  and  surface-wave  data. 

4.  RESULTS  OF  JOINT  INVERSION 

We  applied  the  nonlinear  joint  inversion  method  presented  earlier  in  Section  2  to  the  data  set 
described  in  Section  3  to  construct  a  new  compressional  and  shear  velocity  model,  which  we 
have  dubbed  the  JWM  model  (for  Joint  Weston/  MIT).  The  new  model  comprises  a  set  of  P  and 
S  velocity  profiles  on  a  1°  x  1°  grid  that  is  well  resolved  within  a  latitude  and  longitude  box  of 
(10°  -  50°  N,  40°  -  110°  E).  In  the  following  sections  we  highlight  some  of  the  features  in  the 
new  model,  particularly  in  the  upper  mantle  where  the  sensitivity  of  the  data  sets  is  the  highest. 

4.1  Velocity  Heterogeneities 

In  Figure  7  we  show  the  P  and  S  velocity  variations  with  respect  to  the  AK135  model  for  the 
new  JWM  model  at  three  depths  in  the  upper  mantle:  90,  150  and  210  km.  Some  features  of  note 
in  the  new  model  include: 

•  Crustal  thickening  beneath  the  major  orogenic  zones  in  the  region,  including  the  Hindu 
Kush,  Tibetan  Plateau  and  Altai. 

•  The  presence  of  a  high-velocity  lithospheric  root  to  the  south  of  the  Himalayan  front. 

•  Stronger  lows  in  the  crust  and  uppermost  mantle  shear  velocities  under  the  Tibetan 
Plateau  than  in  the  compressional  velocities. 

•  Signatures  of  the  sedimentary  basin  structures  across  the  region  (e.g.  Tarim,  Ordos) 
persisting  into  the  uppermost  mantle. 

We  show  additional  slices  of  the  JWM  model  at  30  km  intervals  starting  at  30  km  depth  in 
Appendix  C. 
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Figure  7:  Compressional  (left)  and  shear  (right)  velocity  structure  in  the  upper  mantle  of  the 
JWM  inversion  model  (horizontal  slices  at  90  km,  150  km  and  210  km  depths).  The  velocities  are 
displayed  in  percent  deviation  from  the  AK135  model. 
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In  Figure  8  we  show  two  vertical,  or  depth,  slices  through  the  JWM  model  along  great-circle 
paths.  On  the  left  is  a  longitudinal  slice  (at  55°E)  across  the  Saudi  Arabian  Peninsula  to  the  north 
across  southern  Iran  and  into  southern  Turkmenistan.  One  notable  feature  in  this  slice  is  the  low 
velocity  area  with  respect  to  the  background  model  beneath  central  Iran,  which  may  have 
implications  for  the  active  subduction  processes  occurring  beneath  the  Eurasian  continental 
collision  zone.  The  slice  on  the  right  at  85°E  cuts  across  the  Himalayan  Front,  from  northeastern 
India  into  central  China.  Here  the  P  and  S  velocity  models  both  show  the  Himalayan  front  and 
the  thick  lithospheric  root  beneath  the  Tibetan  Plateau.  We  find  that  the  Poisson's  ratio  in  the 
lower  crust  of  the  Plateau  is  strongly  elevated,  which  agrees  with  some  previous  studies  in  the 
region  (e.g.  Owens  and  Zandt,  1997).  However,  in  contrast  with  some  travel-time  based  studies 
(e.g.  Chang  et  al.,  2008),  we  find  velocity  highs  in  the  uppermost  mantle  across  larger  portions 
of  the  Tibetan  Plateau. 
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Figure  8:  Depth  slices  along  two  great-circle  paths  (see  inset  maps  for  path  AB)  in  the  JWM 
inversion  model,  including  P  velocity  (top),  S  velocity  (middle)  and  Poisson's  ratio  (bottom).  Note 
that  a  different  color  scale  is  used  for  crust  (left  scale)  and  upper  mantle  (right  scale)  velocities. 
Some  prominent  tectonic  features  are  marked  by  vertical  black  lines  on  the  top  of  each  cross- 
section. 
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4.2  Resolution  Tests 

In  this  section  we  show  the  results  of  performing  some  tests  using  a  checkerboard  pattern  to 
assess  which  features  of  the  new  model  are  real  and  which  might  be  potential  artifacts. 
Checkerboard  tests  are  well  known  to  have  limitations  (Nolet  et  al.,  1999;  Leveque  et  al.,  1993), 
and  in  our  case  they  are  only  diagnostic  of  how  the  coupled  inversion  behaves  in  the  linearized 
sense.  The  input  model  for  our  checkerboard  experiments  consisted  of  a  lateral  pattern  of 
alternating  positive  and  negative  velocity  perturbations  centered  on  5°  x  5°  squares.  We 
perturbed  the  crustal  layers  by  ±3%  and  the  upper  mantle  layers  by  ±2%  in  both  P  and  S 
velocity.  The  checkerboard  pattern  also  varied  with  depth,  varying  in  the  same  direction  across 
the  three  crustal  layers,  and  then  alternating  at  a  five-node  interval  (approximately  every  100  km) 
in  the  upper  mantle  down  to  410  km.  We  defined  a  checkerboard  model  by  overlaying  this 
pattern  onto  our  model  determined  after  the  fifth  iteration  of  the  nonlinear  inversion.  First-order 
data  perturbations  induced  by  the  checkerbooard  pattern  were  added  to  the  observed  data.  We 
then  performed  five  additional  linearized  inversion  steps  with  the  perturbed  data,  holding  the 
raypath  sensitivities  to  the  values  calculated  with  the  fifth-iteration  model.  The  resolution  test 
was  completed  by  comparing  the  resulting  inversion  model  to  that  obtained  with  the  unperturbed 
data.  The  top  plots  in  Figure  9  show  the  recovered  P  and  S  checkerboard  pattern  as  a  depth  slice 
at  90  km.  The  bottom  of  this  figure  shows  the  profiles  of  the  fifth-iteration,  perturbed 
checkerboard  and  recovered  pattern  velocities  at  the  geographic  point  (30°  N,  60°  E).  Both 
inversions,  with  and  without  the  checkerboard-induced  data  perturbations,  used  the  same 
regularization  scheme,  velocity  bounds,  and  variance  parameters.  Additional  results  for  the 
checkerboard  resolution  test  are  provided  as  depth-horizon,  or  map-view,  slices  in  Appendix  C. 
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Figure  9:  Selected  results  from  a  checkerboard  resolution  test.  Top:  slices  of  the  recovered 
checkerboard  pattern  at  90  km.  Bottom:  the  P  and  S  velocity  profiles  at  the  geographic  point  (30° 
N,  60°  E)  for  the  fifth  iteration,  target  checkerboard  and  recovered  checkerboard  models. 

4.3  Model  Correlations 

In  Figure  10  we  examine  features  of  the  JWM  model  across  the  study  region.  Figure  10a 
shows  the  RMS  amplitude  change  in  the  compressional  and  shear  velocities  with  respect  to  the 
initial  model  as  a  function  of  depth,  averaged  over  the  regional  box  defined  by  (10  -  50°N,  40  - 
110°  E).  The  crust  between  the  surface  and  approximately  50  km  exhibits  the  most  change  from 
the  initial  model  (CRUST2.0  over  an  AK135  upper  mantle).  This  is  primarily  due  to  significantly 
higher  P  velocities  in  the  lower  crust  across  the  Tibetan  Plateau  and  much  lower  P  and  S 
velocities  in  the  entire  crust  along  the  orogenic  belts  crossing  Iran  and  Pakistan,  near  the  northern 
edge  of  the  Caspian  Sea,  and  along  the  eastern  edge  of  the  Indian  plate.  In  some  areas  the 
changes  can  range  as  high  as  ±12%  for  P  and  ±14%  for  S  from  the  values  in  the  CRUST2.0 
model,  even  in  places  where  the  crustal  thickness  did  not  change  significantly  over  the  course  of 
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the  inversion.  Peak  variations  from  the  initial  model  in  the  upper  mantle  reach  ±3.6%  for  P  and 
±4.8%  for  S  velocities. 
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Figure  10.  Features  of  JWM  across  the  study  region,  (a)  RMS  amplitude  (in  percent)  of  the 
changes  to  the  initial  model  for  the  compressional  and  shear  wave  velocities,  averaged  over  the 
study  region;  and  (b)  the  correlation  coefficient  between  compressional  and  shear  velocity  as  a 
function  of  depth  within  the  regional  limits  of  the  new  JWM  model. 

The  compressional  and  shear  velocities  are  strongly  correlated  in  the  crust  and  upper  mantle, 
which  is  an  expected  consequence  of  the  constraints  that  we  implemented  in  the  joint  inversion. 
Figure  10c  shows  the  mean  correlation  coefficient  as  a  function  of  depth  for  the  JWM  model.  We 
calculate  the  correlation  coefficient  across  the  defined  model  region  using  the  traditional  linear 
correlation  formula 
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where  A:  is  a  specific  depth  and  the  sum  is  over  the  set  of  latitudes  and  longitudes  in  the  model 
region.  The  results  in  Figure  10c  indicate  a  strong  positive  correlation  between  the  P  and  S 
velocities  at  most  depths.  The  weakest  correlation  occurs  in  the  uppermost  mantle,  where  the 
correlation  between  P  and  S  drops  close  to  0.5.  The  depths  between  approximately  70  and  120 
km  are  generally  where  both  body  and  surface  wave  data  sets  have  high  sensitivity.  To  take 
advantage  of  this,  our  inversion  constraints  in  the  upper  mantle  regions  allowed  changes  in  the  P 
and  S  velocities  that  were  less  well  correlated  with  one  another. 
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4.3  Data  Variance  Reduction 

In  Figure  1 1  and  Table  5  we  demonstrate  the  improved  fit  to  the  data  sets  achieved  by  the 
new  JWM  inversion  model.  Figure  1 1  shows  histograms  of  the  residuals  for  the  travel-time  and 
group-delay  data  used  in  the  inversion.  In  both  cases  the  joint  inversion  removed  the  mean  in  the 
data  and  lowered  the  variance  around  the  mean.  The  improved  data  fit  produced  by  the  inversion 
is  more  dramatic  for  the  travel-time  data,  and  illustrates  the  inherent  noisiness  of  the  group- 
velocity  data. 


P-Wave  Travel-Time  Residuals  Rayleigh-Wave  Group  Delays 


Figure  11.  Distribution  of  the  time  residuals  for  the  P-wave  travel-time  data  (left)  and 
Rayleigh-wave  group  delays  (right)  for  the  initial  (blue  lines)  and  final  JWM  (red  lines)  inversion 
models. 


The  data  fit  results  are  presented  in  another  fashion  in  Table  5,  which  lists  the  values  of  the 
root-mean-square  (RMS)  error  measured  for  the  P-wave  travel  times  and  Rayleigh-wave  group 
delays  with  respect  to  the  new  JWM  model.  For  comparison  purposes  we  also  include  the  fits  to 
the  initial  and  AK135  models.  The  variance  reduction  from  these  models  is  a  measure  of  the 
success  of  our  inversion  method;  in  addition,  the  variance  reduction  compared  to  AK135  reveals 
how  well  we  constructed  our  prior  model.  Table  5  illustrates  that  the  JWM  inversion  model 
significantly  lowers  the  data  variance  for  the  travel-time  and  group  velocity  data  sets  with  respect 
to  the  other  models.  For  example,  the  percent  variance  reduction  with  respect  to  the  initial  model 
is  51%  for  the  P-wave  travel  times  and  26%  for  the  Rayleigh- wave  group  delays. 

When  we  utilize  subsets  of  the  data  in  the  calculation  of  the  RMS  error  values,  we  reveal 
some  interesting  features  in  the  fits  of  the  various  models.  The  variance  reduction  for  the  far- 
regional  travel  times  is  significantly  higher  (57%)  than  for  the  near-regional  (41%),  which 
reflects  the  more  significant  changes  to  the  upper  mantle  velocities  that  are  due  to  the  increased 
sensitivity  to  the  velocity  structure  at  those  depths  in  our  data. 

The  group  delays  show  considerable  variance  reduction  with  respect  to  both  the  initial  and 
AK135  models,  especially  at  shorter  periods  (T  <  80s).  The  AK135  model  has  an  especially  poor 
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fit  to  the  group-delay  data  at  periods  between  25  -  45  seconds,  which  reflects  the  lack  of  a 
reasonable  crustal  structure  in  our  study  region  and  the  likely  presence  of  noise  in  the  group- 
velocity  data  at  those  periods.  Our  prior  model  corrects  this  bias  even  before  we  perform  the 
inversion;  however,  the  JWM  model  produces  even  better  fits  to  the  group  velocity  data 
compared  to  the  initial  model  across  the  entire  band  of  periods,  especially  at  the  short  periods  in 
our  data  set  (T  <=  20s). 

Table  5:  RMS  error  of  the  data  sets  with  respect  to  the  AK135,  initial  and  final  inversion 
models;  various  subsets  of  the  data  are  also  shown. 


Data  Set 

RMS  Error 
AK135  (s) 

RMS  Error 
Starting 
Model  (s) 

RMS  Error 
JWM(s) 

P  Travel  Times  all  distances 

2.50 

2.87 

2.02 

A  =  0  -  12° 

2.27 

2.43 

1.86 

A  =  12- -18.6° 

2.81 

3.42 

2.24 

Group  Delays  all  periods  (T) 

60.0 

35.2 

30.2 

T=  10 -20  s 

46.9 

37.5 

28.2 

T  =  25  -  45  s 

82.2 

37.6 

36.2 

T  =  50  -  80  s 

45.7 

28.8 

24.1 

T  =  90-  150  s 

28.3 

27.8 

24.7 

5.  DISCUSSION  AND  CONCLUSIONS 
5.1  Tests  for  Predictive  Capability 

As  we  noted  in  the  introduction,  a  primary  objective  of  the  study  was  to  improve  the  quality 
of  event  locations  using  data  from  the  region  covered  by  our  3D  model.  We  have  employed  two 
techniques  to  assess  the  improvement  in  location  performance  of  the  JWM  crust/upper  mantle 
velocity  model.  First,  we  have  assessed  the  ability  of  JWM  to  predict  regional  travel-time 
observations  for  a  set  of  ground-truth  (GT)  events.  Our  GT  database  of  explosions  and  shallow 
earthquakes  was  derived  from  several  sources,  including  the  EHB  bulletin,  the  Group2  Reference 
Event  List  (REL;  Bondar  et  al.,  2004a)  and  a  small  supplemental  list  developed  for  an  IASPEI 
location  workshop  (Engdahl,  2006).  In  Figure  12  we  show  the  events  in  our  GT  database  that  are 
within  the  resolved  boundaries  of  the  JWM  model  (10-50  N,  40-110  E).  The  GT  database 
contains  high-quality  epicenters,  but  the  Group2  events  do  not  include  a  large  number  of  regional 
P  and  S  observations.  Therefore,  we  cross-referenced  the  Group2  REL  events  to  the  EHB 
bulletin  (Engdahl  et  al.,  1998)  to  retrieve  a  larger  set  of  regional  observations.  This  fdtering 
exercise  produced  a  validation  dataset  of  183  explosions  and  225  earthquakes,  with  7,795  Pg,  Pb, 
or  Pn  first  arrivals  and  2,252  Sg,  Sb,  or  Sn  arrivals  observed  at  stations  in  the  box  (0-60  N,  30- 
120  E).  The  great-circle  paths  of  the  P  and  S  GT  observations  are  shown  on  the  right  side  of 
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Figure  12.  The  region  of  the  inversion  model  that  is  sampled  by  the  GT  data  raypaths  is  limited, 
which  illustrates  the  difficulty  of  validating  a  model  in  this  manner. 
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Figure  12:  Ground-truth  epicenters  (GT0-GT7)  in  the  resolved  regions  of  the  JWM  model. 
Left:  explosions  (red  stars)  and  earthquakes  (blue  squares)  within  our  region;  top  right:  great- 
circle  raypaths  for  the  EHB  P  observations  associated  with  events  in  the  GT  dataset;  and  bottom 
right:  great-circle  S  raypaths. 

We  then  predict  the  travel  times  of  our  regional  GT  observations  through  both  the  AK135 
and  JWM  models  using  our  3D  finite-difference  technique  (Podvin  and  Lecomte,  1991)  and 
determine  residuals  with  respect  to  the  GT  observed  data.  We  note  that  the  travel-time  residuals 
for  the  AK135  model  were  computed  using  the  same  finite-difference  approach  as  for  the  JWM 
model,  to  minimize  numerical  effects  that  could  influence  the  comparisons.  In  Figure  13  we 
show  the  results  of  this  exercise,  using  residual  density  plots  in  which  we  binned  and  averaged 
the  residuals  as  a  function  of  station-to-event  epicentral  distance.  The  residuals  were  binned  at 
0.25°  in  epicentral  distance  and  0.5  seconds  in  time.  The  top  two  subplots  in  Figure  13  show  the 
binned  residuals  relative  to  the  AK135  model  for  the  GT  P  (left  column)  and  S  (right  column) 
travel-time  data,  while  the  bottom  subplots  show  the  results  of  the  same  calculation  for  the  JWM 
model.  We  also  overlaid  the  estimated  residual  median  and  spread  at  1°  increments  in  distance 
(blue  dots  and  vertical  lines,  respectively).  The  specific  bin  counts  are  shown  in  the  color  scale  at 
the  bottom  of  each  column;  these  were  chosen  to  help  emphasize  the  tails  of  the  residual 
distributions. 
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The  results  for  the  P  data  clearly  indicate  the  improved  fit  produced  by  the  JWM  model 
across  all  epicentral  distances.  The  AK135  model  is  in  general  too  fast  at  the  shorter  distances 
and  too  slow  at  the  longer  distances.  The  spread  values  in  the  residual  plots  also  indicate  that 
there  are  likely  some  arrivals  in  the  groomed  GT  database  that  have  been  incorrectly  included, 
especially  at  crossover  distances  between  Pg  and  Pn,  or  between  Pn  and  P. 

The  GT  S  data  appear  to  suffer  more  significantly  from  the  presence  of  bad  picks  in  the 
arrival  bulletin,  especially  at  distances  greater  than  approximately  8°.  At  the  shorter  distances  it 
is  clear,  however,  that  the  JWM  model  produces  a  better  fit  to  the  GT  data.  The  scatter  of  the  S 
GT  data  is  not  unexpected  since  regional  S  arrivals  are  often  difficult  to  identify  and  pick,  being 
often  downweighted  in  location  procedures  as  a  result.  We  did  not  perform  a  rigorous  outlier 
analysis  of  the  GT  S  data,  mainly  because  there  is  not  enough  of  them  in  our  ground-truth 
database.  We  relied  instead  on  a  simple  hard  cutoff  of  ±10  seconds  as  a  criterion  for  rejecting 
outlier  residuals.  This  issue  needs  to  be  addressed  in  future  efforts,  since  the  GT  S  data  are  a 
valuable  model  validation  resource. 
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Figure  13:  GT  travel-time  residuals  binned  as  a  function  of  epicentral  distance  for  the  AK135 
(top)  and  JWM  (bottom)  models.  Residual  densities  are  binned  at  0.25°  intervals  in  distance  and 
0.5s  in  time.  The  bin  hit  counts  are  shown  in  the  color  scale  at  the  bottom  of  each  column,  and  the 
median  and  corresponding  spread  at  1°  intervals  are  shown  as  blue  circles  and  vertical  lines. 
Subplots  on  the  left  show  the  P  residuals,  and  subplots  on  the  right  show  the  S  residuals. 

In  a  second  validation  test  we  relocated  a  subset  of  events  in  our  GT  database  to  test  the 
epicentral  location  accuracy  of  the  JWM  model.  To  eliminate  some  of  the  effects  that  network 
geometry  can  have  on  the  solutions,  we  filtered  the  GT  data  to  include  only  those  events  whose 
regional  station  distribution  within  our  model  has  a  secondary  azimuthal  gap  less  than  180°. 
Secondary  azimuth  gap  (the  largest  azimuth  gap  when  a  single  station  of  the  network  is 
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removed)  is  a  good  proxy  for  the  quality  of  the  network  coverage  (Bondar  et  al.,  2004b).  We  also 
removed  stations  that  were  within  2.5°  of  the  event.  This  filtering  reduced  the  testing  data  set  to  a 
list  of  18  explosions  and  112  earthquakes  within  our  region,  with  7,1 1 1  regional  P-wave  arrivals 
and  1,935  regional  S-wave  arrivals.  Figure  14  shows  the  locations  of  our  relocation  events, 
color-coded  according  to  the  secondary  azimuth  gap  of  the  regional  arrivals. 
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Figure  14:  Map  of  events  (112  earthquakes,  18  explosions)  from  our  GT  database  that  meet 
regional  network  criteria  for  the  epicenter  relocation  exercise.  The  events  are  color-coded  by  their 
regional  secondary  azimuth  gap.  Several  clusters  of  events  are  labeled  by  location. 

We  used  the  Grid-search  Multiple-Event  Location  (GMEL)  algorithm  (Rodi,  2006)  to 
relocate  the  events  shown  in  Figure  14  with  subsets  of  the  regional  arrival  bulletin.  The  GT 
relocation  experiments  were  done  with  the  event  depths  fixed  to  their  reported  bulletin  values 
and  origin  times  allowed  to  vary.  The  arrival-time  errors  were  set  to  1.0  s  for  P  and  2.0  s  for  S 
observations.  Figure  15  directly  compares  the  epicenter  mislocations  achieved  using  AK135 
versus  JWM  travel-time  predictions.  Events  that  fall  above  the  black  unity  line  indicate  that  the 
JWM  model  relocates  an  event  closer  to  the  published  GT  value.  In  Figure  15a  the  results  show 
that,  for  event  locations  based  on  P  arrivals  alone,  JWM  performs  better  than  AK135  for  all  of 
the  explosions  and  most  of  the  earthquakes.  We  also  performed  relocations  including  both  P  and 
S  arrivals,  and  the  results  in  Figure  15b  show  that  JWM  still  outperforms  the  AK135  model, 
although  not  as  decisively  as  when  only  P  arrivals  are  used.  It  is  clear  that  adding  the  S  arrivals 
has  a  slightly  more  harmful  effect  on  the  locations  for  JWM  than  for  AK135.  However,  we  note 
that  JWM  produces  smaller  mislocations  overall. 
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Figure  15:  Epicenter  Dislocations  resulting  with  the  AK135  versus  JWM  models  for  the  set  of 
events  (red  stars:  explosions;  blue  squares:  earthquakes)  shown  in  Figure  14.  Events  that  fall  in  the 
gray  shaded  region  above  the  black  unity  line  in  each  plot  indicate  'wins'  for  the  JWM  model,  (a) 
Relocation  results  using  regional  P  arrivals;  (b)  Relocation  results  using  regional  P  and  S  arrivals. 


5.2  Model  Uncertainty 

A  final  task  in  our  project  was  to  find  techniques  that  can  be  used  to  calculate  the  uncertainty 
of  the  new  JWM  model.  The  Bayesian  framework  on  which  our  joint  inversion  method  is  based 
provides  a  formal  framework  for  model  uncertainty,  whereby  model  uncertainty  is  characterized 
by  the  posterior  probability  distribution  on  model  parameters  inferred  from  the  prior  distributions 
on  the  parameters  and  on  data  errors.  A  number  of  factors  make  it  difficult  to  implement  this 
framework  in  a  complete  way.  First,  for  large,  nonlinear  inverse  problems  in  general,  it  is  only 
practical  to  calculate  posterior  parameter  distributions  under  the  approximation  of  linearization. 
In  addition,  our  particular  choice  of  algorithms  presents  further  roadblocks  to  a  complete 
calculation,  even  for  the  linearized  problem.  In  particular,  in  solving  the  body-wave  and  surface- 
wave  linearized  problems  sequentially,  rather  than  simultaneously,  we  preclude  a  full  Bayesian 
uncertainty  analysis  for  the  joint  body- wave/surface- wave  inverse  problem.  Moreover,  our 
factoring  of  the  surface-wave  problem  stymies  a  full  Bayesian  uncertainty  analysis  of  that 
problem,  even  considered  alone.  For  these  reasons,  our  uncertainty  analysis  as  of  this  writing  is 
restricted  to  the  linearized  body-wave  travel-time  tomography  problem,  without  consideration  of 
the  information  the  surface-wave  data  may  provide  about  P  velocity  or  of  the  potential  trade-offs 
among  P  velocity,  S  velocity  and  crustal  thickness. 

As  described  in  Section  2.7,  our  application  of  the  Bayesian  framework  employs  a 
geostatistical  parameterization  of  the  prior  model  variance,  involving  a  correlation  kernel 
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C0(x,x')  to  specify  the  correlation  coefficient  between  the  model  function  -  in  this  case  P 

velocity  -  at  any  two  spatial  points  x  and  x'.  The  correlation  kernel  is  parameterized  in  terms  of 
correlation  lengths  in  the  horizontal  and  vertical  directions.  In  addition,  a  function  om(x)  specifies 
the  prior  standard  deviation  (root  variance)  of  the  model  function  at  each  position.  The  Bayesian 
methodology  for  linear  inverse  problems  straightforwardly  defines  posterior  versions  of  these 
same  quantities:  CJW!0(x,x')  and  (fosm(x).  As  part  of  our  linear  inversion  algorithms,  we 

developed  numerical  techniques  for  computing  the  prior  and  posterior  variance  of  the  model 
function  and  a  “slice”  of  the  prior  and  posterior  correlation  kernels  for  a  particular  fiducial  points 
xo,  e.g.  Co(x,  xo)  as  a  function  of  x. 

Figures  16  and  17  show  examples  of  model  correlation  slices  for  various  fiducial  points, 
computed  for  the  JWM  P-wave  velocity  inversion  model.  In  Figure  16  we  show  prior  (top  and 
posterior  (middle)  correlation  slices  for  two  fiducial  point  near  the  center  of  our  study  region 
(30°  N,  75°  E)  at  depths  60  km  and  120  km.  Note  the  de-correlation  between  the  crust  and  upper 
mantle  velocity  in  the  prior  slice,  which  is  an  intentional  feature  discussed  in  Section  2.7. 
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Figure  16:  Vertical  cross-sections  through  prior  (top)  and  posterior  (middle)  correlation  slices 
for  the  JWM  P-wave  model.  Slices  are  shown  for  two  fiducial  points  in  the  middle  of  the  study 
region  (30°  N,  75°  E)  at  depths  of  60  km  (left)  and  120  km  (right).  Maps  at  the  bottom  of  the 
subplots  show  the  selected  cross-sections  as  the  AB  lines.  The  number  label  "Std.  Dev."  is  the  model 
standard  deviation  at  the  fiducial  point. 

The  posterior  correlation  slice  includes  the  information  from  travel-time  data,  leading  to  a 
reduction  in  model  variance  and  in  the  correlation  lengths  implied  by  the  correlation  decay  from 
the  fiducial  point.  Strikingly,  some  posterior  correlations  are  negative,  especially  between  crust 
and  mantle  velocities  (e.g.  the  left/middle  plot). 
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In  contrast  to  Figure  16,  the  correlation  slices  in  Figure  17  correspond  to  fiducial  points  near 
the  easternmost  side  of  our  study  region  (15°  N,  105°  E)  and  (45°  N,  105°  E)  at  120  km  depth.  In 
these  cases  the  prior  and  posterior  correlations  and  standard  deviations  are  more  similar  to  each 
other  because  there  is  a  dearth  of  travel-time  data  to  constrain  the  model  at  these  points. 
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Figure  17:  More  examples  of  vertical  cross-sections  through  prior  (top)  and  posterior  (middle) 
correlation  slices  for  the  JWM  P-wave  model.  Slices  are  shown  for  two  fiducial  points  (15°  N,  105° 
E)  and  (45°  N,  105°  E)  on  the  eastern  edge  of  the  study  region  at  120  km  depth.  Plotting 
conventions  are  the  same  as  in  Figure  16. 

5.3  Summary 

In  this  study  we  have  demonstrated  the  application  of  a  nonlinear  joint  inversion  of  body- 
wave  travel  times  and  surface-wave  group  velocities  to  data  from  a  broad  region  in  south-central 
Asia.  The  forward  modeling  incorporated  in  our  inversion  utilizes  fully  3D  ray  tracing  for  the 
body-wave  travel  times,  and  a  two-step  procedure  for  the  surface-wave  dispersion  data  that 
includes  1-D  dispersion  modeling  at  a  geographic  point  followed  by  2-D  ray  tracing  in  the  phase 
velocity  maps.  We  numerically  solved  the  inverse  problem  using  a  set  of  iterated  inversion  steps. 
Consistency  between  the  P  and  S  velocities  was  achieved  by  imposing  bounds  on  Poisson's  ratio 
and  by  invoking  a  regularization  constraint  that  correlates  variations  in  P  and  S  velocity  from  an 
initial  model.  The  resulting  inversion  model  shows  good  agreement  with  the  persistent  features 
seen  in  previous  seismic  velocity  models  produced  from  separate  body-  or  surface-wave  data 
sets,  as  well  as  some  intriguing  differences  between  the  compressional  and  shear  wave  structure. 
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Our  primary  goal  in  developing  the  new  inversion  approach  and  the  JWM  model  was  to 
improve  regional  seismic  event  location  capability  in  a  strongly  heterogeneous  crust  and  upper 
mantle.  We  have  tested  the  JWM  model  for  its  predictive  capabilities  using  data  from  a  large 
database  of  ground-truth  events,  which  were  held  out  from  the  joint  inversion.  The  tests  include 
the  relocation  of  the  ground-truth  events,  using  data  sets  of  Pn-only  and  Pn/Sn  arrivals,  and  the 
direct  comparison  of  predicted  Pn  and  Sn  travel  times  to  the  ground-truth  observations.  Both 
types  of  tests  indicate  that  our  3D  inversion  model  has  much  better  predictive  capability  than 
either  a  1-D  global  model  or  our  initial  model,  which  comprised  a  3D  crustal  structure  overlying 
a  1-D  mantle.  Our  S  velocity  model  performed  reasonably  well  in  predicting  Sn  times,  even 
though  such  data  were  not  included  in  our  joint  inversion. 
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8.  APPENDIX  A:  Factored  Inversion 


We  consider  an  inverse  problem  for  m  of  the  form 

d  =  F(G(  m))  +  e,  (Al) 

where  d  is  a  data  vector,  m  is  a  model  vector,  e  is  an  error  vector,  and  F  and  G  are  nonlinear 
transformations  whose  composition  is  the  forward  model  for  the  problem.  We  take  the  solution 
of  (Al)  to  be  the  model  minimizing  the  objective  function 

^(m)  =  (d  -  F(G(m)))7’W(d  -  F(G(m)))  +  (m  -  m0)TD(m  -  m0).  (A2) 

In  the  context  of  stochastic  inversion,  m0  is  the  prior  mean  of  m,  D  is  the  inverse  of  the  prior 
covariance  operator  of  m,  and  W  is  the  inverse  of  the  prior  covariance  operator  of  e. 

Define  the  objective  functions 

^i(u)  =  (d  -  F(u))rW(d  -  F(u))  +  (u  -  G(m0))TD1(u  -  G( m0))  (A3) 

^(m)  =  (u  -  G(m))TW2(u  -  G(m))  +  (m  -  m0)7’D2(m  -  m0).  (A4) 

We  will  show  that,  under  linear  approximations  to  F  and  G  and  for  suitable  choices  of  the 
operators  Dl5  D2  and  W2,  the  vector  m  minimizing  W2,  where  u  minimizes  Vi,  is  the  same 
vector  that  minimizes  ’P  in  (A2).  This  in  effect  factors  the  inverse  problem  of  Equation  (Al)  into 
two  inverse  problems  having  forward  transformations  F  and  G,  respectively: 

d  =  F(u)  +  e1  (A5) 

u  =  G(m)  +  e2.  (A6) 

We  see  that  u  acts  as  the  model  vector  in  the  first  problem  and  the  data  vector  in  the  second. 

We  start  by  expressing  the  composite  inverse  problem  as  the  stationarity  condition  obtained 
by  equating  the  gradient  of ’P  to  zero: 

D(m  -  m0)  =  BrArW(d  -  F(G(m)))f  (A7) 

where  the  linear  transformation  A  is  the  Jacobian  of  F  evaluated  at  G  (m),  and  B  is  the  Jacobian 
of  G  evaluated  at  m.  The  stationarity  conditions  for  the  factor  problems  are  similarly  given  by 

Dx(u  -  G(m0))  =  ArW(d  -  F( u))  (A8) 

D2(m  —  m0)  =  BtW2(u  —  G(m)).  (A9) 

Presuming  that  the  same  value  of  m  satisfies  Equations  (A9)  and  (A7),  as  it  is  our  objective  to 
prove,  the  Jacobian  of  G  in  each  equation  is  the  same,  allowing  us  to  denote  both  as  B.  However, 
the  value  of  u  satisfying  (A8)  is  not  necessarily  G(m),  in  which  case  the  Jacobian  of  F,  shown  as 
A  in  both  equations,  may  differ.  Therefore,  we  will  replace  F  henceforth  with  its  linear 
approximation,  such  that 

F(u)  *  F(G(m))  +  A(u  -  F(G(m))),  (A10) 
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where,  as  in  Equation  (A7),  A  is  the  Jacobian  of  F  evaluated  at  G(m).  Using  (A  10),  the 
stationarity  condition  for  the  first  factor  inverse  problem  becomes 

Di(u  -  G(m0))  =  ArW(d  -  F(G(m))  -  Au  +  AG(m).  (A1 1) 

Equation  (A1 1)  is  solved  by 

u  =  G(m0)  +  (ArWA  +  DO-1 

ArW(d  -  F(G( m))  +  AG(m)  -  AG(m0))  (A12) 

which  implies 

u  -  G(m)  =  (AtWA  +  DO-1 

[Di(G (m0)  -  G(m))  +  ATW(d  -  F(G(m)))].  (A13) 

Substituting  this  into  (A9)  yields 

D2(m  -  m„)  =  BrW2(ArWA  +  DO-1 

[Di(G(m0)  -  G(m))  +  ArW(d  -  F(G(m)))].  (A14) 

Now  let 

W2=ArWA  +  D1.  (A  15) 

Equation  (A14)  becomes 

D2(m  -  m0)  =  BTD1(G(m0)  -  G(m))  +  BTATW(d  -  F(G(m))).  (A16) 


Up  to  this  point  we  have  not  needed  to  approximate  the  function  G,  but  now  we  do. 
Linearizing  about  m,  let 

G(m0)  w  G(m)  +  B(m0  —  m).  (A17) 

Applying  this  in  the  first  term  of  Equation  (A16),  we  get 

D2(m  -  m0)  =  BTD1B(m0  -  m)  +  BTATW(d  -  F(G(m))).  (A18) 

This  becomes  Equation  (A7),  the  stationarity  condition  for  the  composite  inverse  problem,  by 
setting 

D  =  BrD1B  +  D2.  (A  19) 

The  equivalence  of  the  composite  inverse  problem  and  factor  inverse  problems  occurs  when 
W2,  D1  and  D2  are  chosen  in  accordance  with  (A15)  and  (A19).  Equation  (A15)  sets  the  prior 
covariance  operator  of  e2  in  the  second  factor  problem  to  the  posterior  covariance  of  u  resulting 
from  the  first  inverse  problem.  Equation  (A  19)  relates  the  prior  model  covariances  of  the 
composite  and  factor  problems. 
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9.  APPENDIX  B:  Conditional  Inversion 


We  consider  an  objective  function  for  a  nonlinear  inverse  problem  for  parameter  vectors  a 
and  b,  given  by 

mb)  =  (t  -  Ft(a))rWt(t  -  Ft( a))  +  (g  -  Fg( b)f  W5(g  -  F/b)) 

+(a  -  a0)rDaa(a  -  a0)  +  (a  -  a0)rDab(b  -  b0) 

+(b  -  b0)TD,a(a  -  a0)  +  (b  -  b0)rDb,(b  -  b0).  (Bl) 

The  vectors  a0  and  b0  play  the  role  of  prior  means  of  a  and  b,  respectively.  Denoting 


D  1  plays  the  role  of  a  variance/covariance  matrix  on  a  and  b.  We  will  show  that  the  conditional 
minimum  of  ’P  with  respect  to  a,  holding  b  fixed,  also  minimizes  the  objective  function 

V*(a)  =  (t  -  Ft(a))TWt(t  -  Ft(a))  +  (a  -  a£fD*a(a  -  aj)  (B3) 

for  appropriate  choices  of  a£  and  T)*aa- 

While  this  can  be  shown  in  general,  we  will  restrict  attention  to  a  parameter 
variance/covariance  matrix  of  the  form 


where  Ia  and  If,  are  diagonal  matrices  containing  the  standard  deviations  of  the  components  of 
a  and  b,  respectively;  C0  is  a  correlation  matrix  for  both  a  and  b;  and  R  is  a  diagonal  matrix  of 
correlation  coefficients  between  a  and  b.  We  require  that  R  and  C0  commute: 

RC0  =  C0R.  (B5) 

Further,  we  restrict  Ia  and  as  indicated  in  Equation  (B9)  below.  Given  (B4),  we  have 


where  D0  =  Cq1  and  S  =  I  —  R2.  Note  that  Equation  (B5)  ensures  that  R  and  S  each  commute 
with  D0  (as  they  do  with  Ia  and  If,). 

To  find  ao  and  D„a,  we  start  by  writing  the  stationarity  condition  implied  by  setting  the 
gradient  of  with  respect  to  a  to  zero: 

Daa(a  -  a„)  +  Dol,(b  -  b„)  =  AlWt(t  -  Ft(a)).  (B7) 

The  conditional  minimum  of  4J,  for  any  fixed  b,  is  achieved  by  solving  (B7)  for  a.  Equation 
(B6)  allows  us  to  write 

Daa(a  -  a0)  +  DaZ, (b  -  b0)  =  S^I^DoK^a  -  a0)  -  RI^O*  -  b0)].  (B8) 
Assume  the  parameter  standard  deviations  and  prior  means  are  in  a  common  ratio: 
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^ala  0  —  ^b^O- 

(B9) 

Then 

Daa(a  a0)  T  Dafc (b  bo)  Daa(a  a0) 

(BIO) 

where 

ao  =  (I  —  R)a0  +  R -Za^b 

(B 11) 

Daa  =  S^E^DoE-1  =  S-1Daa. 

(B12) 

Plugging  (BIO)  into  the  stationarity  condition,  (B7),  gives 

D^a  (a  —  ao)  =  AfWt(t  —  Ft(a)), 

(B13) 

which  is  the  stationarity  condition  associated  with  the  objective  function  4**  in  Equation  (B3). 

Equation  (B1 1)  states  that  a£  is  a  weighted  average  of  a 0  and  the  scaled  version  of  b  given 
by  The  relative  weighting  of  the  two  vectors  is  determined  by  the  correlation- 

coefficient  matrix  R,  with  R  =  0  (no  correlation  between  a  and  b)  resulting  in  a<j  =  a 0,  as  one 
should  expect.  Equation  (B12)  states  that  D*aa  is  Daa  inflated  by  S-1,  where  S  =  I  when  R  =  0. 
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10.  APPENDIX  C:  Additional  Inversion  Results 

In  the  next  four  figures  we  show  the  map  view  slices  of  the  JWM  model  at  30-km  intervals, 
starting  at  30  km  below  the  surface.  The  results  are  provided  in  percent  deviation  from  the 
AK135  model. 
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Figure  Cl.  Map  view  slices  of  the  JWM  model  in  percent  deviation  with  respect  to  AK135  at  30 
km,  60  km,  and  90  km  depth.  P  velocities  are  shown  on  the  left  and  S  velocities  on  the  right. 
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Figure  C2.  Map  view  slices  of  the  JWM  model  in  percent  deviation  with  respect  to  AK135  at 
120  km,  150  km,  and  180  km  depth.  P  velocities  are  shown  on  the  left  and  S  velocities  on  the  right. 
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Figure  C3.  Map  view  slices  of  the  JWM  model  in  percent  deviation  with  respect  to  AK135  at 
210  km,  240  km,  and  270  km  depth.  P  velocities  are  shown  on  the  left  and  S  velocities  on  the  right. 
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Figure  C4.  Map  view  slices  of  the  JWM  model  in  percent  deviation  with  respect  to  AK135  at 
300  km  depth.  P  velocities  are  shown  on  the  left  and  S  velocities  on  the  right. 

Next  we  show  the  results  of  the  checkerboard  test,  again  by  plotting  map  view  slices  of  the 
retrieved  checkerboard  model  at  30-km  intervals,  starting  at  30  km  below  the  surface.  The  results 
are  provided  in  percent  deviation  from  the  final  JWM  model  velocities. 
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Figure  C5.  Map  view  slices  of  the  recovered  P  (left)  and  S  (right)  checkerboard  models  at 
depths  of  30  km,  60  km,  and  90  km. 
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Figure  C6.  Map  view  slices  of  the 
depths  of  120  km,  150  km  and  180  km. 
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Figure  C7.  Map  view  slices  of  the  recovered  P  (left)  and  S  (right)  checkerboard  models  at  a 
depth  of  210  km,  240  km  and  270  km. 
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Figure  C8.  Map  view  slices  of  the  recovered  P  (left)  and  S  (right)  checkerboard  models  at 
depths  of  300  km. 
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